Skip to main content

Advanced SQL Statements

SQL processes queries in this logical order:

  1. FROM → get the data sources.
  2. WHERE → filter individual rows.
  3. GROUP BY → group rows.
  4. HAVING → filter groups.
  5. SELECT → calculate aggregates and select columns.
  6. ORDER BY → sort the result set.
  7. LIMIT / OFFSET (or FETCH FIRST N ROWS) → return only a subset of rows.

Aggregate Functions

Aggregate functions in SQL are functions that perform calculations on multiple rows of data and return a single value.

They are often used with the GROUP BY clause to perform calculations on grouped data.

Common Aggregate Functions

FunctionDescription
COUNT()Counts the number of rows (or non-NULL values if column is specified)
SUM()Calculates the total of a numeric column
AVG()Calculates the average of a numeric column
MIN()Finds the smallest value in a column
MAX()Finds the largest value in a column

General Syntax

SELECT AGGREGATE_FUNCTION(column_name)
FROM table_name
WHERE condition;
  • Can be combined with GROUP BY to apply per group.
  • Can be combined with HAVING to filter groups based on aggregated values.

Example

order_idcustomer_idproductquantitypriceorder_date
1C001Laptop110002025-08-01
2C002Mouse2252025-08-01
3C001Keyboard1452025-08-02
4C003Laptop19502025-08-02
5C002Mouse3252025-08-03

Problem: For each customer, how many orders have they placed, and how many total items have they bought?

Solution:

SELECT customer_id,
COUNT(*) AS total_orders,
SUM(quantity) AS total_items
FROM orders
GROUP BY customer_id;

Output

customer_idtotal_orderstotal_items
C00122
C00225
C00311
  • Aggregate functions always return one value per group.
  • You cannot pass multiple columns into a single aggregate directly.
  • Use CASE expressions inside aggregates for conditional calculations.

Total Revenue per Customer

SELECT customer_id,
SUM(quantity * price) AS total_revenue
FROM orders
GROUP BY customer_id;

Count Products Separately

SELECT customer_id,
SUM(CASE WHEN product = 'Laptop' THEN quantity ELSE 0 END) AS total_laptops,
SUM(CASE WHEN product = 'Mouse' THEN quantity ELSE 0 END) AS total_mice
FROM orders
GROUP BY customer_id;

Grouping Data

GROUP BY in SQL is used to arrange identical data into groups so you can apply aggregate functions (like COUNT, SUM, AVG, MIN, MAX) on each group separately.

Think of it as saying: Take all the rows, group them based on this column (or set of columns), and then summarize each group.

General Syntax

SELECT column1, AGGREGATE_FUNCTION(column2)
FROM table_name
WHERE condition
GROUP BY column1;
  • Every column in SELECT that is not inside an aggregate function must be listed in GROUP BY.
  • GROUP BY comes after WHERE but before HAVING and ORDER BY.

Example

order_idcustomer_idproductquantitypriceorder_date
1C001Laptop110002025-08-01
2C002Mouse2252025-08-01
3C001Keyboard1452025-08-02
4C003Laptop19502025-08-02
5C002Mouse3252025-08-03
SELECT customer_id, COUNT(*) AS total_orders
FROM orders
GROUP BY customer_id;

Output:

customer_idtotal_orders
C0012
C0022
C0031
  • GROUP BY customer_id → All rows with the same customer_id are combined into one group.
  • COUNT(*) counts how many orders are in each group

Key Points to Remember for GROUP BY

  • Use GROUP BY when you want one row per group instead of one row per record.
  • All non-aggregated columns in SELECT must be in GROUP BY.
  • HAVING is for group-level filtering; WHERE is for row-level filtering.
  • You can group by multiple columns or expressions.

Filtering Groups

HAVING is used to filter groups after aggregation has happened. It works with GROUP BY and aggregate functions like COUNT, SUM, AVG, MIN, MAX. WHERE filters rows before grouping, while HAVING filters aggregated results after grouping.

Syntax of HAVING

SELECT column1, AGGREGATE_FUNCTION(column2)
FROM table_name
WHERE row_condition
GROUP BY column1
HAVING aggregate_condition
ORDER BY column1;

Example of HAVING

Why not use WHERE?

Let’s say we want to find customers who ordered more than 2 items total.

This fails:

SELECT customer_id, SUM(quantity)
FROM orders
WHERE SUM(quantity) > 2 -- ❌ INVALID: WHERE cannot use aggregates
GROUP BY customer_id;

Because SUM(quantity) doesn’t exist until after grouping.

This works:

SELECT customer_id, SUM(quantity) AS total_items
FROM orders
GROUP BY customer_id
HAVING SUM(quantity) > 2;

Output:

customer_idtotal_items
C0025

Explanation

  1. Group orders by customer_id.
  2. Calculate SUM(quantity) for each group.
  3. Keep only groups where SUM(quantity) > 2.

HAVING without GROUP BY

You can use HAVING without GROUP BY. In that case, it treats the entire table as one group.

Example: Find if total sales exceed 2000:

SELECT SUM(quantity * price) AS total_sales
FROM orders
HAVING SUM(quantity * price) > 2000;

It will show total_sales is 2120..

Multiple Conditions in HAVING

You can filter on more than one aggregate condition.

Example: Customers who ordered more than 2 items and placed at least 2 orders:

SELECT customer_id,
COUNT(order_id) AS order_count,
SUM(quantity) AS total_items
FROM orders
GROUP BY customer_id
HAVING SUM(quantity) > 2
AND COUNT(order_id) >= 2;

WHERE vs HAVING

FeatureWHEREHAVING
Filters rows✅ Yes❌ No
Filters groups❌ No✅ Yes
Can use aggregates❌ No✅ Yes
ExecutesBefore groupingAfter grouping

Join

A JOIN combines rows from two or more tables based on a related column (usually a foreign key relationship).

You tell SQL:

Match rows from these tables where a specific condition is true, and return the combined data.

Types of Joins

Join TypeReturns…
INNER JOINOnly rows where there’s a match in both tables
LEFT JOINAll rows from the left table, plus matched rows from the right (NULL if no match)
RIGHT JOINAll rows from the right table, plus matched rows from the left (NULL if no match)
FULL OUTER JOINAll rows from both tables, matched where possible, NULL where no match

Example Tables for Join

customers

customer_idnamecountry
1AliceUSA
2BobUK
3CharlieCanada
4DianaUSA

orders

order_idcustomer_idproductamount
1011Laptop1200
1022Mouse25
1031Keyboard45
1043Monitor200
1055Desk Chair150

Use the smallest dataset first.

INNER JOIN

Returns only customers who have orders.

SELECT c.customer_id, c.name, o.product, o.amount
FROM customers c
INNER JOIN orders o
ON c.customer_id = o.customer_id;
customer_idnameproductamount
1AliceLaptop1200
1AliceKeyboard45
2BobMouse25
3CharlieMonitor200
  • INNER JOIN only includes rows where customer_id exists in both customers and orders.
  • Diana is excluded (no order), and order 105 is excluded (customer_id=5 not in customers).

LEFT JOIN

Returns all customers, plus their orders if any.

SELECT c.customer_id, c.name, o.product, o.amount
FROM customers c
LEFT JOIN orders o
ON c.customer_id = o.customer_id;
customer_idnameproductamount
1AliceLaptop1200
1AliceKeyboard45
2BobMouse25
3CharlieMonitor200
4DianaNULLNULL
  • Diana shows up with NULL values for product and amount because she has no orders.
  • This is useful when you want all left-side records regardless of matches.

RIGHT JOIN

Returns all orders, plus customer details if any.

SELECT c.customer_id, c.name, o.product, o.amount
FROM customers c
RIGHT JOIN orders o
ON c.customer_id = o.customer_id;
customer_idnameproductamount
1AliceLaptop1200
1AliceKeyboard45
2BobMouse25
3CharlieMonitor200
NULLNULLDesk Chair150
  • Order 105 appears, but since customer_id=5 doesn’t exist in customers, customer details are NULL.
  • This is essentially the mirror of LEFT JOIN.

FULL OUTER JOIN

Returns all customers and all orders, matched where possible.

SELECT c.customer_id, c.name, o.product, o.amount
FROM customers c
FULL OUTER JOIN orders o
ON c.customer_id = o.customer_id;
customer_idnameproductamount
1AliceLaptop1200
1AliceKeyboard45
2BobMouse25
3CharlieMonitor200
4DianaNULLNULL
NULLNULLDesk Chair150
  • Combines the results of LEFT JOIN and RIGHT JOIN.
  • Shows unmatched customers (like Diana) and unmatched orders (like Desk Chair).

Always use explicit JOIN conditions, not implicit joins via the WHERE clause.

-- Avoid implicit join in WHERE clause:
SELECT orders.order_id, customers.customer_name
FROM orders, customers
WHERE orders.customer_id = customers.customer_id;

-- Use explicit JOIN instead:
SELECT orders.order_id, customers.customer_name
FROM orders
JOIN customers ON orders.customer_id = customers.customer_id;

Implicit join will return a Cartesian product, meaning every order will be combined with every customer, which could lead to a massive number of rows in the result set if the tables are large.

Visual Summary

 INNER JOIN:        LEFT JOIN:         RIGHT JOIN:        FULL OUTER JOIN:

A ∩ B A ⊃ (A ∩ B) (A ∩ B) ⊂ B A ∪ B
  • INNER JOIN → Only the overlap
  • LEFT JOIN → Everything from left table + matches
  • RIGHT JOIN → Everything from right table + matches
  • FULL OUTER JOIN → Everything from both tabl

Subquery (Nested Query)

A Subquery is a query inside another query.

  • It is enclosed in parentheses ( ).
  • It can appear in the SELECT, FROM, WHERE, or HAVING clauses.
  • The main query (outer query) uses the result of the subquery (inner query).

When to Use Subqueries?

  • To break complex problems into smaller, manageable queries.
  • When joins are not straightforward.
  • To check existence (EXISTS), compare aggregates, or filter with conditions.

Types of Subqueries

Single-row Subquery

  • Returns only one row.
  • Usually used with comparison operators: =, <, >, <=, >=, <>.

Example:

Find students who have the same department as Alice.

SELECT student_name
FROM Students
WHERE department_id = (
SELECT department_id
FROM Students
WHERE student_name = 'Alice'
);
  • Inner query: Finds Alice’s department_id.
  • Outer query: Returns all students in that department.

Multi-row Subquery

  • Returns multiple rows.
  • Used with operators like IN, ANY, ALL.

Example:

Find students enrolled in any course in the Science department

SELECT student_name
FROM Students
WHERE student_id IN (
SELECT student_id
FROM Enrollments
WHERE course_id IN (
SELECT course_id
FROM Courses
WHERE department_id = (
SELECT department_id
FROM Departments
WHERE department_name = 'Science'
)
)
);
  • Inner-most query: Gets the department_id of "Science".
  • Next query: Gets all course_id belonging to "Science".
  • Next query: Gets student_id of students enrolled in those courses.
  • Outer query: Finds student names from Students.

Correlated Subquery

  • Inner query depends on the outer query.
  • Runs once for each row of the outer query.

Example:

Find students who are enrolled in more than one course.

SELECT s.student_name
FROM Students s
WHERE (
SELECT COUNT(*)
FROM Enrollments e
WHERE e.student_id = s.student_id
) > 1;
  • For each student in Students, the subquery counts how many courses they are enrolled in.
  • Outer query selects only those with count > 1.

Subqueries in Different Clauses

  1. In SELECT clause (Scalar Subquery)

    SELECT student_name,
    (SELECT department_name
    FROM departments d
    WHERE d.department_id = s.department_id) AS dept_name
    FROM students s;

    Adds department name directly in the result without a join.

  2. In FROM clause (Derived Table / Inline View)

    SELECT dept_id, COUNT(*) AS total_students
    FROM (
    SELECT department_id AS dept_id
    FROM students
    ) AS temp
    GROUP BY dept_id;

    Subquery acts like a temporary table (temp).

  3. In HAVING clause

    SELECT course_id, COUNT(student_id) AS total_enrolled
    FROM enrollments
    GROUP BY course_id
    HAVING COUNT(student_id) > (
    SELECT AVG(total)
    FROM (
    SELECT COUNT(student_id) AS total
    FROM enrollments
    GROUP BY course_id
    ) AS sub
    );

    Finds courses where enrollment is greater than the average enrollment across all courses.

Subquery vs Join

Subqueries (especially in the WHERE clause) can often be inefficient, especially if they are correlated subqueries (i.e., where the subquery references columns from the outer query). Whenever possible, replace subqueries with JOIN or EXISTS clauses.

  • Subquery: More readable for step-by-step logic, but sometimes slower.
  • Join: Usually faster for large datasets, but can be harder to read.

Example: Both return students enrolled in “Math”:

Using Subquery:

SELECT student_name
FROM Students
WHERE student_id IN (
SELECT student_id
FROM Enrollments e
JOIN Courses c ON e.course_id = c.course_id
WHERE c.course_name = 'Math'
);

Using Join:

SELECT DISTINCT s.student_name
FROM Students s
JOIN Enrollments e ON s.student_id = e.student_id
JOIN Courses c ON e.course_id = c.course_id
WHERE c.course_name = 'Math';

UNION in MySQL

The UNION operator is used to combine the results of two or more SELECT queries into a single result set.

Rules of UNION:

  1. Each SELECT must have the same number of columns.
  2. The data types of corresponding columns must be compatible.
  3. By default, UNION removes duplicate rows.
  4. Use UNION ALL if you want to keep duplicates.

Example of UNION

-- Students table
CREATE TABLE Students (
student_id INT,
name VARCHAR(100),
email VARCHAR(100)
);

-- Teachers table
CREATE TABLE Teachers (
teacher_id INT,
name VARCHAR(100),
email VARCHAR(100)
);

-- Query using UNION
SELECT name, email FROM Students
UNION
SELECT name, email FROM Teachers;
  • Combines student and teacher contacts into one list.
  • If a student and teacher share the same email, it will appear only once.

Avoid Using OR in WHERE Clauses

OR can cause MySQL to perform more work than necessary, especially if it causes full table scans. When possible, try to rewrite the query to avoid OR.

-- Avoid using OR in WHERE clause:
SELECT order_id, total_amount FROM orders WHERE total_amount < 100 OR total_amount > 500;

-- Rewrite the query with a UNION:
SELECT order_id, total_amount FROM orders WHERE total_amount < 100
UNION
SELECT order_id, total_amount FROM orders WHERE total_amount > 500;

Example with UNION ALL:

SELECT name, email FROM Students
UNION ALL
SELECT name, email FROM Teachers;

Same as above, but duplicates are kept.

INTERSECT in MySQL

Unlike some databases (like PostgreSQL, Oracle, SQL Server), MySQL does NOT support INTERSECT directly.

But we can simulate it using INNER JOIN or IN clause.

Example of INTERSECT

-- Using INNER JOIN
SELECT s.name, s.email
FROM students s
INNER JOIN teachers t ON s.email = t.email;

OR using subquery:

SELECT name, email FROM Students
WHERE email IN (SELECT email FROM Teachers);
  • Finds records that exist in both tables (intersection).
  • Works like INTERSECT.

Comparing UNION vs INTERSECT

FeatureUNIONINTERSECT (simulated in MySQL)
Combines rows✅ Yes❌ No (filters only common rows)
Removes duplicates✅ By default✅ Naturally (since only common rows)
Keeps duplicates✅ With UNION ALL❌ Not applicable
Availability in MySQL✅ Native support❌ Must simulate using JOIN or IN

Example of UNION and INTERSECT

Suppose we have two mailing lists:

  • event_registrations (people who registered for an event).
  • newsletter_subscribers (people who subscribed to the newsletter).
  1. People in either list (UNION):
    SELECT email FROM event_registrations
    UNION
    SELECT email FROM newsletter_subscribers;
  2. People in both lists (INTERSECT):
    SELECT email FROM event_registrations
    WHERE email IN (SELECT email FROM newsletter_subscribers);
  • UNION gives everyone we can contact.
  • INTERSECT gives our most engaged people (registered + subscribed).